Movie/Script: Alignment and Parsing of Video and Text Transcription
Internal identifier: 000C09 (Main/Exploration); previous: 000C08; next: 000C10
Authors: Timothee Cour [United States]; Chris Jordan [United States]; Eleni Miltsakaki [United States]; Ben Taskar [United States]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ] ; 2008.
Abstract: Movies and TV are a rich source of diverse and complex video of people, objects, actions and locales “in the wild”. Harvesting automatically labeled sequences of actions from video would enable creation of large-scale and highly-varied datasets. To enable such collection, we focus on the task of recovering scene structure in movies and TV series for object tracking and action retrieval. We present a weakly supervised algorithm that uses the screenplay and closed captions to parse a movie into a hierarchy of shots and scenes. Scene boundaries in the movie are aligned with screenplay scene labels and shots are reordered into a sequence of long continuous tracks or threads which allow for more accurate tracking of people, actions and objects. Scene segmentation, alignment, and shot threading are formulated as inference in a unified generative model and a novel hierarchical dynamic programming algorithm that can handle alignment and jump-limited reorderings in linear time is presented. We present quantitative and qualitative results on movie alignment and parsing, and use the recovered structure to improve character naming and retrieval of common actions in several episodes of popular TV series.
DOI: 10.1007/978-3-540-88693-8_12
Links to previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 001E85
- to stream Istex, to step Curation: 001D55
- to stream Istex, to step Checkpoint: 000676
- to stream Main, to step Merge: 000C21
- to stream Main, to step Curation: 000C09
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Movie/Script: Alignment and Parsing of Video and Text Transcription</title>
<author><name sortKey="Cour, Timothee" sort="Cour, Timothee" uniqKey="Cour T" first="Timothee" last="Cour">Timothee Cour</name>
</author>
<author><name sortKey="Jordan, Chris" sort="Jordan, Chris" uniqKey="Jordan C" first="Chris" last="Jordan">Chris Jordan</name>
</author>
<author><name sortKey="Miltsakaki, Eleni" sort="Miltsakaki, Eleni" uniqKey="Miltsakaki E" first="Eleni" last="Miltsakaki">Eleni Miltsakaki</name>
</author>
<author><name sortKey="Taskar, Ben" sort="Taskar, Ben" uniqKey="Taskar B" first="Ben" last="Taskar">Ben Taskar</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:4D113318F9911978071D0A7B8FD0031994AF3C74</idno>
<date when="2008" year="2008">2008</date>
<idno type="doi">10.1007/978-3-540-88693-8_12</idno>
<idno type="url">https://api.istex.fr/document/4D113318F9911978071D0A7B8FD0031994AF3C74/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001E85</idno>
<idno type="wicri:Area/Istex/Curation">001D55</idno>
<idno type="wicri:Area/Istex/Checkpoint">000676</idno>
<idno type="wicri:doubleKey">0302-9743:2008:Cour T:movie:script:alignment</idno>
<idno type="wicri:Area/Main/Merge">000C21</idno>
<idno type="wicri:Area/Main/Curation">000C09</idno>
<idno type="wicri:Area/Main/Exploration">000C09</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Movie/Script: Alignment and Parsing of Video and Text Transcription</title>
<author><name sortKey="Cour, Timothee" sort="Cour, Timothee" uniqKey="Cour T" first="Timothee" last="Cour">Timothee Cour</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>University of Pennsylvania, 19104, Philadelphia, PA</wicri:regionArea>
<placeName><region type="state">Pennsylvanie</region>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Jordan, Chris" sort="Jordan, Chris" uniqKey="Jordan C" first="Chris" last="Jordan">Chris Jordan</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>University of Pennsylvania, 19104, Philadelphia, PA</wicri:regionArea>
<placeName><region type="state">Pennsylvanie</region>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Miltsakaki, Eleni" sort="Miltsakaki, Eleni" uniqKey="Miltsakaki E" first="Eleni" last="Miltsakaki">Eleni Miltsakaki</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>University of Pennsylvania, 19104, Philadelphia, PA</wicri:regionArea>
<placeName><region type="state">Pennsylvanie</region>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Taskar, Ben" sort="Taskar, Ben" uniqKey="Taskar B" first="Ben" last="Taskar">Ben Taskar</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>University of Pennsylvania, 19104, Philadelphia, PA</wicri:regionArea>
<placeName><region type="state">Pennsylvanie</region>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2008</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">4D113318F9911978071D0A7B8FD0031994AF3C74</idno>
<idno type="DOI">10.1007/978-3-540-88693-8_12</idno>
<idno type="ChapterID">12</idno>
<idno type="ChapterID">Chap12</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Movies and TV are a rich source of diverse and complex video of people, objects, actions and locales “in the wild”. Harvesting automatically labeled sequences of actions from video would enable creation of large-scale and highly-varied datasets. To enable such collection, we focus on the task of recovering scene structure in movies and TV series for object tracking and action retrieval. We present a weakly supervised algorithm that uses the screenplay and closed captions to parse a movie into a hierarchy of shots and scenes. Scene boundaries in the movie are aligned with screenplay scene labels and shots are reordered into a sequence of long continuous tracks or threads which allow for more accurate tracking of people, actions and objects. Scene segmentation, alignment, and shot threading are formulated as inference in a unified generative model and a novel hierarchical dynamic programming algorithm that can handle alignment and jump-limited reorderings in linear time is presented. We present quantitative and qualitative results on movie alignment and parsing, and use the recovered structure to improve character naming and retrieval of common actions in several episodes of popular TV series.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Pennsylvanie</li>
</region>
</list>
<tree><country name="États-Unis"><region name="Pennsylvanie"><name sortKey="Cour, Timothee" sort="Cour, Timothee" uniqKey="Cour T" first="Timothee" last="Cour">Timothee Cour</name>
</region>
<name sortKey="Cour, Timothee" sort="Cour, Timothee" uniqKey="Cour T" first="Timothee" last="Cour">Timothee Cour</name>
<name sortKey="Jordan, Chris" sort="Jordan, Chris" uniqKey="Jordan C" first="Chris" last="Jordan">Chris Jordan</name>
<name sortKey="Jordan, Chris" sort="Jordan, Chris" uniqKey="Jordan C" first="Chris" last="Jordan">Chris Jordan</name>
<name sortKey="Miltsakaki, Eleni" sort="Miltsakaki, Eleni" uniqKey="Miltsakaki E" first="Eleni" last="Miltsakaki">Eleni Miltsakaki</name>
<name sortKey="Miltsakaki, Eleni" sort="Miltsakaki, Eleni" uniqKey="Miltsakaki E" first="Eleni" last="Miltsakaki">Eleni Miltsakaki</name>
<name sortKey="Taskar, Ben" sort="Taskar, Ben" uniqKey="Taskar B" first="Ben" last="Taskar">Ben Taskar</name>
<name sortKey="Taskar, Ben" sort="Taskar, Ben" uniqKey="Taskar B" first="Ben" last="Taskar">Ben Taskar</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000C09 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000C09 | SxmlIndent | more
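Outside of Dilib, the same record can also be read with a general-purpose XML parser. The following is only a minimal sketch, not part of the Dilib toolchain: it uses Python's standard `xml.etree.ElementTree` on a trimmed copy of the record above. The name `summarize` and the `RECORD` constant are hypothetical, introduced here for illustration. The `wicri:`-prefixed attributes are left out of the sample because the record as displayed does not declare that namespace prefix, and a namespace-aware parser would reject it.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record shown above. Assumption: wicri:-prefixed
# attributes are omitted because their namespace is not declared here.
RECORD = """<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en">Movie/Script: Alignment and Parsing of Video and Text Transcription</title>
<author><name sortKey="Cour, Timothee" first="Timothee" last="Cour">Timothee Cour</name></author>
<author><name sortKey="Jordan, Chris" first="Chris" last="Jordan">Chris Jordan</name></author>
<author><name sortKey="Miltsakaki, Eleni" first="Eleni" last="Miltsakaki">Eleni Miltsakaki</name></author>
<author><name sortKey="Taskar, Ben" first="Ben" last="Taskar">Ben Taskar</name></author>
</titleStmt>
<publicationStmt>
<date when="2008" year="2008">2008</date>
<idno type="doi">10.1007/978-3-540-88693-8_12</idno>
</publicationStmt></fileDesc></teiHeader></TEI></record>"""


def summarize(record_xml):
    """Return title, author names, year and DOI from a record string.

    Hypothetical helper: it only looks at <titleStmt> and <publicationStmt>.
    """
    root = ET.fromstring(record_xml)
    title_stmt = root.find(".//titleStmt")
    pub_stmt = root.find(".//publicationStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "year": pub_stmt.findtext("date"),
        "doi": pub_stmt.findtext("idno[@type='doi']"),
    }


if __name__ == "__main__":
    for field, value in summarize(RECORD).items():
        print(f"{field}: {value}")
```

Reading the author list from `<titleStmt>` rather than from the `<affiliations>` tree avoids the repeated `<name>` entries that appear there.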
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:4D113318F9911978071D0A7B8FD0031994AF3C74 |texte= Movie/Script: Alignment and Parsing of Video and Text Transcription }}
This area was generated with Dilib version V0.6.32.